I. Introduction

A prototype MMPP Traffic Generator (TG) has been designed for testing of the
COMSAT-supplied SCAR II Fast Packet Switch. By generating packets distributed
according to a Markov-Modulated Poisson Process (MMPP) model, it allows the
assessment of the switch performance under traffic conditions that are more realistic than
could be generated using the COMSAT-supplied Traffic Generator Module. The MMPP
model is widely believed to model accurately real-world superimposed voice and data
communications traffic.

The TG was designed to be, as far as possible, a “drop-in” replacement for the
COMSAT Traffic Generator Module. The latter fit on two Altera EPM7256EGC 192-pin
CPLDs and produced traffic for one switch input port. No board changes are necessary
because the TG has been partitioned to use the existing board traces. The TG, consisting of
parts “TGDATPROC” and “TGRAMCTL”, must merely be reprogrammed into the Altera
devices of the same name. However, the ‘040 controller software must be modified to
provide TG initialization data. This data will be given in Section II.

I.1 High-Level Description

Figure 1 depicts the inputs and outputs of the TG. Tables 1 and 2 list their 
functions and correspondence to signal names in the COMSAT documentation. Active 
low signals are indicated by the ‘BAR’ suffix (for TG signals) and by the ‘backslash I’ 
suffix (for COMSAT signals). Signal names were preserved with minor exceptions.
Numbers in COMSAT signal names indicate the generator index (one of eight possible). 
This is not the case in our design, even though the numbers were retained in some cases. 


[Block diagram: the MMPP Traffic Generator (TG).
Inputs: BYTCLK, R1C_DATA[15..0], R1D_DATA[7..0], VMEDATBAR[15..0], VMEADRBAR[15..0], VMECTRLBAR[4..0].
Outputs: TGDRDYBAR, TGDATA[7..0], TGUWFLGBAR, R1C_CE, R1D_CE, R1D_ADDR[13..0], R1C_ADDR[13..0].]

Figure 1 : Traffic Generator Block Diagram 
Table 1: Inputs to the Traffic Generator

Input | Function | Corresponding SCAR II Name
BYTCLK | 20 MHz byte clock (system clock) | TG*BYTCLK (TG Module 6/20/94 page 2)
R1C_DATA[15..0] | Memory bank C data. Contains the state dwell and idle packet counts, statistically generated according to the user-specified distribution | same (Channel 1 Traffic Generator, 11/10/94 page 2)
R1D_DATA[7..0] | Memory bank D data. Contains the route numbers statistically distributed according to the user-specified distribution | same (Channel 1 Traffic Generator, 11/10/94 page 3)
VMEDATBAR[15..0] | VME Slave data output lines | VMEDAT[15..0]\I (Channel 1 Traffic Generator, 11/10/94 page 3)
VMEADRBAR[15..0] | VME Slave address output lines | VMEADR[15..0]\I (Channel 1 Traffic Generator, 11/10/94 page 3)
VMECTRLBAR[4..0] | VME Slave control output lines | VMECTRL[4..0]\I (Channel 1 Traffic Generator, 11/10/94 page 3)


Table 2: Outputs of the Traffic Generator

Output | Function | Corresponding SCAR II Name
TGDATA[7..0] | output data to ECL serialization logic | TG1DATA[7..0]\I (TG Module 11/10/94 page 3)
TGDRDYBAR | output data ready flag | TG1DRDY\I (TG Module 11/10/94 page 3)
TGUWFLGBAR | frame boundary indicator | TG1UWFLG\I (TG Module 11/10/94 page 3)
R1C_CE | memory bank C enable | same (Channel 1 Traffic Generator, 11/10/94 page 2)
R1D_CE | memory bank D enable | same (Channel 1 Traffic Generator, 11/10/94 page 2)
R1C_ADDR[13..0] | memory bank C address | same (Channel 1 Traffic Generator, 11/10/94 page 2)
R1D_ADDR[13..0] | memory bank D address | same (Channel 1 Traffic Generator, 11/10/94 page 2)
I.2 External Interfaces


An overview of the environment in which the TG will reside is given in Figure 2.
All external interfaces function identically to the COMSAT design. The TG is controlled
by the COMSAT dual '040 controller, which in turn is controlled by an ethernet-connected
workstation. Direct control of the TG is performed by a 7256 CPLD called the
VME Slave, which translates the '040 commands to the appropriate data and control
signals required by the TG. Figure 3 depicts the interfaces to the memory banks and the
VME Slave. Details of the type of data and control will be given below in Section II.
Note that although the TG requires no modification of the COMSAT hardware, the
workstation and/or ‘040 controller software must be modified to initialize the TG, as
discussed below.





Figure 2: TG Environment 



Figure 3 : Interfaces to Banks C, D and VME Slave 


I.3 TG Operation

The normal operating sequence of the TG consists of an initialization phase and a
packet generation phase. During the initialization phase the VME Slave controls the TG
and must perform the seven functions listed in Table 3. For ready reference, Table 4 lists
the required combinations of signals VMEADRBAR[12..0], VMECTRLBAR[4..0] and
VMEDATBAR[15..0].
Table 3: VME Slave Responsibilities During TG Initialization

VME Slave Initialization Functions

1. Send reset command to TG: assert VMEADRBAR9 and
VMECTRLBAR4 for at least one period of BYTCLK.

2. Load the frame marker: assert VMEADRBAR10 and VMEADRBAR1 for
at least one BYTCLK period with the frame marker on VMEDATBAR[7..0].

3. Load the test length count: assert VMEADRBAR10 and VMEADRBAR0
for at least one BYTCLK period with the test length count on
VMEDATBAR[15..0].

4. Load the LFSR initialization count: assert VMEADRBAR10,
VMEADRBAR1 and VMEADRBAR0 for at least one BYTCLK period with
the initialization count on VMEDATBAR[15..0].

5. Load the congestion control throttles: assert VMEADRBAR11 and the
combinations in Table 4 for at least two BYTCLK periods with the throttle
settings on VMEDATBAR[15..0]. Odd (even) throttles must be placed on the
lower (upper) byte of VMEDATBAR.

6. Load the congestion control enable bits: assert VMEADRBAR12 for at
least one BYTCLKD period with the enable bits on VMEDATBAR[7..0].
Port numbers and bit numbers in VMEDATBAR[7..0] correspond.

7. Enable the TG and begin the test: assert VMEADRBAR9 and
VMECTRLBAR3 for at least one period of BYTCLK.

Table 4: VME Signal Combinations During Initialization

VMEADRBAR[12..0] | VMECTRLBAR[4..0] | VMEDATBAR[15..0] | Function
xxx1xxxxxxxxx | 1xxxx | xxxxxxxxxxxxxxxx | Reset TG
xx1xxxxxxxx1x | xxxxx | xxxxxxxxdddddddd | Load the frame marker
xx1xxxxxxxxx1 | xxxxx | dddddddddddddddd | Load the test length count
xx1xxxxxxxx10 | xxxxx | dddddddddddddddd | Load the LFSR initialization count
x1xxxxxxxxx00 | xxxxx | eeeeeeeeoooooooo (o=odd, e=even) | Load the congestion control throttles for ports 1 and 2
x1xxxxxxxxx01 | xxxxx | eeeeeeeeoooooooo (o=odd, e=even) | Load the congestion control throttles for ports 3 and 4
x1xxxxxxxxx10 | xxxxx | eeeeeeeeoooooooo (o=odd, e=even) | Load the congestion control throttles for ports 5 and 6
x1xxxxxxxxx11 | xxxxx | eeeeeeeeoooooooo (o=odd, e=even) | Load the congestion control throttles for ports 7 and 8
1xxxxxxxxxxxx | xxxxx | xxxxxxxxdddddddd | Load congestion control enable bits
xxx1xxxxxxxxx | x1xxx | xxxxxxxxxxxxxxxx | Enable the TG and begin the test
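
The seven initialization steps can be sketched in software as a list of signal words that controller code might present to the VME Slave. This is only an illustrative model following the step wording of Table 3: the names `VmeWord`, `bits` and `init_sequence` are hypothetical, a 1 bit means "asserted" (the physical BAR pins are active low), and a BYTCLKD period is counted as two BYTCLK periods.

```python
from typing import List, NamedTuple


class VmeWord(NamedTuple):
    adr: int         # asserted VMEADRBAR bits (1 = asserted; pins are active low)
    ctrl: int        # asserted VMECTRLBAR bits
    dat: int         # value presented on VMEDATBAR[15..0]
    min_bytclk: int  # minimum hold time in BYTCLK periods


def bits(*positions: int) -> int:
    """Return an integer with the given bit positions set."""
    value = 0
    for p in positions:
        value |= 1 << p
    return value


def init_sequence(frame_marker: int, test_length: int, lfsr_init: int,
                  throttles: List[int], enables: int) -> List[VmeWord]:
    """Build the seven steps of Table 3; throttles[0] is port 1, throttles[7] is port 8."""
    if len(throttles) != 8:
        raise ValueError("one throttle per output port is required")
    seq = [
        VmeWord(bits(9), bits(4), 0, 1),                       # 1. reset
        VmeWord(bits(10, 1), 0, frame_marker & 0xFF, 1),       # 2. frame marker
        VmeWord(bits(10, 0), 0, test_length & 0xFFFF, 1),      # 3. test length count
        VmeWord(bits(10, 1, 0), 0, lfsr_init & 0xFFFF, 1),     # 4. LFSR init count
    ]
    for pair in range(4):                                      # 5. throttles, two ports per word
        odd = throttles[2 * pair] & 0xFF       # ports 1, 3, 5, 7 -> lower byte
        even = throttles[2 * pair + 1] & 0xFF  # ports 2, 4, 6, 8 -> upper byte
        seq.append(VmeWord(bits(11) | pair, 0, (even << 8) | odd, 2))
    seq.append(VmeWord(bits(12), 0, enables & 0xFF, 2))        # 6. enable bits (one BYTCLKD)
    seq.append(VmeWord(bits(9), bits(3), 0, 1))                # 7. enable TG, begin test
    return seq
```

A caller would walk the returned list in order, holding each word for at least `min_bytclk` byte-clock periods.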


After the TG is enabled (the final step in the initialization phase), it enters the
packet generation phase. This phase lasts as long as necessary to generate the total (i.e.,
busy and idle) number of packets specified by the VME Slave during initialization step 3.
During the packet generation phase any step in Table 3 can be executed at any time (e.g.,
step 1 would reset the TG). However, step 4 will not have any effect because the LFSR
initialization count is ignored during packet generation. Once the required number of
packets has been generated, the TG enters a reset state called PORSET (see Figure 15),
where it stays until the receipt of an enable command from the VME Slave.


I.3.1 Frame and Packet Format


The frame and packet structure used is the same as that used by COMSAT. A
frame consists of a one-byte frame marker followed by four 58-byte packets (see SCAR
Program Phase II Critical Design Review 4/14/94). A packet consists of a five-byte
header and a 53-byte payload, with the header shown in Figure 4. All bits of the header
are set to zero except for the route number, which is a variate distributed according to a
completely arbitrary, user-programmable pdf whose inverse distribution is stored in
memory bank D. Note that this feature (in conjunction with the congestion control
throttle mechanism) gives us exceptional flexibility in dealing with congestion and
modeling traffic sources. Idle packets have a route number of all zeroes. The remaining
53 bytes of the packet consist of all zeroes except for a time stamp in the two bytes
immediately following the header. The stamp is a count of the number of BYTCLK
transitions since the receipt of the enable signal.
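
The frame layout described above can be illustrated with a small byte-level model. It is a sketch under stated assumptions: the byte offset of the route number inside the header (`ROUTE_OFFSET`) and the big-endian order of the time stamp are not given in the text and are hypothetical choices, as are the function names.

```python
PACKET_BYTES = 58        # five-byte header + 53-byte payload
HEADER_BYTES = 5
PACKETS_PER_FRAME = 4
FRAME_BYTES = 1 + PACKETS_PER_FRAME * PACKET_BYTES  # marker + four packets

ROUTE_OFFSET = 1         # hypothetical position of the route number byte


def build_packet(route: int, timestamp: int, route_offset: int = ROUTE_OFFSET) -> bytes:
    """Return one 58-byte packet: zero header except the route number,
    zero payload except a 16-bit time stamp right after the header."""
    pkt = bytearray(PACKET_BYTES)
    pkt[route_offset] = route & 0xFF
    pkt[HEADER_BYTES] = (timestamp >> 8) & 0xFF   # assumed big-endian byte order
    pkt[HEADER_BYTES + 1] = timestamp & 0xFF
    return bytes(pkt)


def build_frame(marker: int, packets: list) -> bytes:
    """Prefix four packets with the one-byte frame marker."""
    if len(packets) != PACKETS_PER_FRAME:
        raise ValueError("a frame carries exactly four packets")
    if any(len(p) != PACKET_BYTES for p in packets):
        raise ValueError("every packet must be 58 bytes long")
    return bytes([marker & 0xFF]) + b"".join(packets)
```

An idle packet is simply `build_packet(0, timestamp)`, since idle packets carry a route number of all zeroes.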


[Header fields, in order: Priority (2 bits), Size (2 bits), Route Number (8 bits), Type (2 bits), SN (4 bits), VCN, Rx Station ID (8 bits), FEC (3 bits). The Route Number follows the user-programmable pdf; all other fields are constant (zero).]

Figure 4: Packet Header Format
I.3.2 The MMPP and the Packet Generation Process
Packets are generated according to a Markov-Modulated Poisson Process
(MMPP). This stochastic process is widely used in network performance research to
model the bursty and correlative aspects of traffic which contains a mix of voice, video
and data.

The MMPP has two states: “busy” and “idle”. The former is characterized by
higher rates of busy packet generation than the latter. The interarrival times between
busy packets in either state are Poisson distributed. The times spent in either of the states
are geometrically distributed. The number of packets (busy and idle) generated in either
state will henceforth be referred to as the “state dwell time”. The state diagram for the
MMPP is given in Figure 5, where λ1, λ2 refer respectively to the Poisson parameters for
the busy and idle state interarrival times, and α, β refer to the probabilities of leaving the
busy and idle states, respectively.



Figure 5: State Diagram for the MMPP 

The TG manages the state dwell time using a counter which stores the total
number m (0 ≤ m ≤ 65535) of packets to be generated in the state. The counter is loaded
upon state entry with an integer-valued geometric deviate, chosen from either the busy
state geometric distribution or the idle state geometric distribution, depending on the type
of state being entered. Both types of deviate are stored in memory bank C, whose layout
is discussed below.

The packet interarrival times are modeled by another counter (called the “idle
packet counter”) which contains the number u (0 ≤ u ≤ 65535) of consecutive idle
packets that are to be generated before the next busy packet. The counter is loaded upon
state entry and after each busy packet generated while remaining in the same state. The
load values are integer-valued Poisson deviates, chosen from either the busy state Poisson
distribution or the idle state Poisson distribution, depending on the state the TG is in.
Both types of deviate are stored in memory bank C, whose layout will now be discussed.
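
The interplay of the two counters can be illustrated with a short software model. It is a sketch, not the hardware: the deviates are drawn directly with Python's `random` module rather than read from memory bank C, the function names are hypothetical, and the dwell time is assumed to be at least one packet.

```python
import math
import random


def geometric(rng: random.Random, p_leave: float) -> int:
    """State dwell time in packets, with leave probability p_leave per packet."""
    n = 1
    while rng.random() >= p_leave and n < 65535:
        n += 1
    return n


def poisson(rng: random.Random, lam: float) -> int:
    """Poisson deviate by Knuth's multiplication method, capped at 16 bits."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit and k < 65535:
        k += 1
        prod *= rng.random()
    return k


def generate(total_packets: int, lam_busy: float, lam_idle: float,
             alpha: float, beta: float, seed: int = 1) -> list:
    """Return a list of 'busy'/'idle' packet labels of length total_packets."""
    rng = random.Random(seed)
    out = []
    in_busy_state = True
    while len(out) < total_packets:
        lam = lam_busy if in_busy_state else lam_idle
        dwell = geometric(rng, alpha if in_busy_state else beta)  # dwell counter m
        idle_count = poisson(rng, lam)                            # idle packet counter u
        while dwell > 0 and len(out) < total_packets:
            if idle_count > 0:
                out.append("idle")
                idle_count -= 1
            else:
                out.append("busy")
                idle_count = poisson(rng, lam)  # reload after each busy packet
            dwell -= 1
        in_busy_state = not in_busy_state
    return out
```

A small Poisson parameter in the busy state and a larger one in the idle state reproduces the difference in idle packet density between the two states.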

The generation of the required deviates is done using the “inverse distribution”
method, in which variates from an arbitrary distribution are obtained by evaluating the
inverse of the distribution at uniformly-distributed values. Memory banks C and D store
the inverse distribution values in look-up table fashion. A 32-bit LFSR generates
uniform variates which address the banks. This look-up table approach is preferable to
computational circuitry for this application because the memory is already available with
the COMSAT hardware. This approach is also faster and user-programmable.

Memory bank C is laid out as shown in Table 5. Bank D is used for the route
number distribution. Its map is not shown because it is trivial. Note that the distributions
of the dwell times and idle packet counts are represented to 12-bit accuracy (i.e., memory
address width), instead of the 14 bits used for the route number distribution.




Table 5: Map for Memory Bank C

Address R1C_ADDR[13..0] | Data
11dddddddddddd | Busy state dwell times
10dddddddddddd | Busy state idle packet counts
01dddddddddddd | Idle state dwell times
00dddddddddddd | Idle state idle packet counts
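
As an illustration of the inverse distribution method and of the bank C map, the sketch below fills one 4096-entry region with the inverse of a geometric distribution and forms a 14-bit address from a region code and a 12-bit uniform variate. The function names, the mid-point sampling of each table cell, and the choice of a geometric example are assumptions made for illustration.

```python
import math

OFFSET_BITS = 12
REGION_SIZE = 1 << OFFSET_BITS  # 4096 entries per region

# Upper two address bits, as listed in Table 5.
REGION = {
    ("busy", "dwell"): 0b11,
    ("busy", "idle_count"): 0b10,
    ("idle", "dwell"): 0b01,
    ("idle", "idle_count"): 0b00,
}


def geometric_inverse_table(p_leave: float, size: int = REGION_SIZE,
                            max_value: int = 65535) -> list:
    """Entry k holds the smallest n >= 1 whose CDF 1 - (1 - p)^n reaches
    the uniform value u = (k + 0.5) / size."""
    table = []
    for k in range(size):
        u = (k + 0.5) / size
        n = math.ceil(math.log(1.0 - u) / math.log(1.0 - p_leave))
        table.append(min(max(n, 1), max_value))
    return table


def bank_c_address(state: str, kind: str, uniform12: int) -> int:
    """Combine the region code with a 12-bit uniform variate."""
    return (REGION[(state, kind)] << OFFSET_BITS) | (uniform12 & (REGION_SIZE - 1))
```

Reading `table[uniform12]` with a uniformly distributed index then yields a deviate with the tabulated distribution, which is what the look-up in bank C accomplishes in hardware.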


A high-level block diagram of the TG is shown in Figure 6, where it can be seen
that the TG generates all 58 bytes of a packet simultaneously in parallel. Conversion to
byte-serial format (required by the ECL logic) is performed by an output multiplexer
MUX, which gives very high speed. The complicated parts of the design with long
critical paths (e.g., the route number generator/congestion control handler ROUTGEN2) run
at half clock (signal BYTCLKD) because simulation has shown that they are too slow for
full speed and cannot be simplified. Fortunately, new route number
generation/congestion control commences as soon as the previous route number is
output to the ECL interface. This fact allows as many as 57 byte times to be used for the
process, which is more than enough.

Figure 6: High-Level Block Diagram of the TG

I.3.3 Simulation Results

Figure A1 on the next page shows the bursty, stochastic nature of a sample packet
output stream. The two bottom traces (ROUTSENT and active-low IDLESEND_BAR)
indicate packet (idle or busy) times and idle packet times, respectively, for a simulation
sequence which begins in the busy state for seven packets, dwells in the idle state for
eight packets, and then returns to the busy state for another seven packets. The difference in
idle packet density between the two states is clear from the two traces.

Figure A2 shows the same simulation with the congestion control bit enabled for
output port 3, and its throttle set to 1. The “pre-congestion control” route number (i.e.,
the byte accessed from memory bank D) is FF. By observing the trace of the output
(TGDATA[7..0]) between the vertical lines it can be seen that the route fields in
successive packets alternate between FF and 7F, indicating correct congestion control on
output port 3 with a throttle of 1.

II. TG Design Details 

The partitioning of the TG between the two CPLDs is shown in Figure 7. 



II.1 TGDATPROC Details

The upper module (TGDATPROC) functions are given in Table 6. Tables 7 and 
8 list the functions of the inputs and outputs, respectively. 


Table 6: Functions of the TGDATPROC Module

TGDATPROC Functions

1. Generate packet bytes for ECL serialization logic

2. Compute packet time stamps

3. Generate packet route numbers with congestion control

4. Store congestion control throttle numbers

5. Store frame marker
Table 7: TGDATPROC Inputs and their Functions

Input | Function
VMEDATBAR[15..0] | initializing data from VME Slave (frame marker, congestion control enable bits and throttle settings for channels)
VMEADRBAR[15..0] | addresses for setting of frame marker, congestion bits and throttle settings
BYTCLKD | BYTCLK divided-by-2 for congestion control (see ROUTGEN2 module below)
BUSYSENDBAR | initializes busy packet generation
R1D_DATA[7..0] | memory bank D data output lines which contain the statistically-generated routing number (before congestion processing)
IDLESEND_BAR | initializes idle packet generation
ENROUTEBAR | starts the route number generation
RESETBAR | global reset for TG (see CNTRLER below)
BYTCLK | system clock (20 MHz)
TMSTMPENABAR | starts the timer (used to stamp the outgoing packets)
ENADNCNTR_BAR | starts the packet generation


Table 8: TGDATPROC Outputs and their Functions

Output | Function
TGDRDYBAR | output data ready flag
TGDATA[7..0] | output data to ECL serialization logic
ROUTDONE | controller signal to acknowledge the completion of the route number congestion control adjustment process
ROUTSENT | controller signal to acknowledge the sending of the route number in the output stream
TGUWFLGBAR | frame boundary indicator


The schematic of TGDATPROC is shown in Figure 8. The ROUTGEN2
schematic is shown in Figure 9. ROUTGEN2 is responsible for throttle storage,
congestion control and the generation of the appropriately modified route number
accessed from memory bank D, which generates route numbers according to the
user-specified statistical distribution.

The congestion control scheme is as follows. The VME Slave loads throttle
settings for each of the eight switch output ports in ROUTGEN2 at any time, including
reset. Throttle value n causes an idle packet to be inserted every n successive busy
packets destined for that port. Throttle control is enabled only when the throttle enable
bit for that port is asserted. The throttle enable bits are loaded from the VME Slave at
any time, and stored in ROUTGEN2.

This process is implemented by a loopback scheme. Each bit of the routing
number (accessed from memory bank D) is sequentially tested. If a particular bit is
set, then it is reset only if that port’s throttle is enabled and the number of consecutive
busy packets destined for that port that have already been sent equals the throttle setting.
The number of packets already sent is called the “throttle count” and is maintained in
ROUTGEN2 by a down counter. If the reset is performed, the throttle count is restored to its
original (VME Slave-provided) value. If not, the count is decremented. The route
number computed by this process is output from the TG.
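
The loopback procedure can be mirrored by a small software model. This is a sketch under assumptions: the class name `ThrottleModel` is hypothetical, bit i of the route number is taken to address one output port, and a port's count is left unchanged when a packet is not destined for it (the text does not say what happens in that case).

```python
class ThrottleModel:
    """Per-port down counters that clear a route bit after every n packets."""

    def __init__(self, settings, enable_mask: int):
        self.settings = list(settings)   # throttle value n for each of 8 route bits
        if len(self.settings) != 8:
            raise ValueError("eight throttle settings are required")
        self.enable_mask = enable_mask & 0xFF
        self.counts = list(self.settings)  # throttle counts start at the setting

    def process(self, route: int) -> int:
        """Return the route number after congestion control."""
        out = route & 0xFF
        for bit in range(8):             # each bit is tested in turn
            mask = 1 << bit
            if not (route & mask) or not (self.enable_mask & mask):
                continue
            if self.counts[bit] == 0:
                out &= ~mask & 0xFF                    # reset the bit for this packet
                self.counts[bit] = self.settings[bit]  # restore the original value
            else:
                self.counts[bit] -= 1
        return out
```

With a throttle of 1 enabled on the most significant bit and a constant input of FF, `process` returns FF and 7F alternately, the same alternation reported for the Figure A2 simulation.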



Figure 8: Schematic of the TGDATPROC Module
Other TGDATPROC subsystems are straightforward except for MUXCNTRL,
whose text listing is shown in Figure 10. MUXCNTRL is designed to convert the output
of the down counter DNCNTR to appropriate select signals for the output multiplexer
OUTMUX. The design is not trivial because of the necessity to insert the frame header
every four packets.

SUBDESIGN muxcntrl
(
    count[7..0]         : INPUT;
    sel[2..0], routsent : OUTPUT;
    endpacket, endframe : OUTPUT;
)
BEGIN
    CASE count[] IS
        WHEN 0 =>
            sel[] = 7;
            endframe = VCC;
            endpacket = VCC;
        WHEN 52, 110, 168, 226 => sel[] = 6;
        WHEN 53, 111, 169, 227 => sel[] = 5;
        WHEN 54, 112, 170, 228 => sel[] = 4;
        WHEN 55, 113, 171, 229 =>
            sel[] = 3;
            routsent = VCC;
        WHEN 56, 114, 172, 230 => sel[] = 2;
        WHEN 57, 115, 173, 231 => sel[] = 1;
        WHEN 58, 116, 174 =>
            sel[] = 7;
            endpacket = VCC;
        WHEN 232 => sel[] = 0;
        WHEN OTHERS => sel[] = 7;
    END CASE;
END;


Figure 10: Module OUTMUX Design 
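
The decoding performed by the listing can be checked with a small software model that mirrors its case table. The Python function below is an illustrative transcription, not part of the design; it assumes the down counter runs from 232 to 0 over one 233-byte frame.

```python
# Select value for each position within a 58-byte packet slot (count mod 58).
HEADER_SEL = {57: 1, 56: 2, 55: 3, 54: 4, 53: 5, 52: 6}


def muxcntrl(count: int):
    """Return (sel, routsent, endpacket, endframe) for an 8-bit count."""
    if not 0 <= count <= 255:
        raise ValueError("count is an 8-bit value")
    sel = 7                                   # default: WHEN OTHERS
    if count == 232:
        sel = 0                               # frame marker byte
    elif count < 232 and count % 58 in HEADER_SEL:
        sel = HEADER_SEL[count % 58]
    routsent = sel == 3
    endpacket = count in (0, 58, 116, 174)
    endframe = count == 0
    return sel, routsent, endpacket, endframe


def frame_trace():
    """Decode one full frame, counting down from 232 to 0."""
    return [muxcntrl(c) for c in range(232, -1, -1)]
```

Running `frame_trace()` shows one frame-marker select at the start, six non-default selects at the head of each of the four packets, and an end-of-packet flag on the last byte of every packet.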
II.2 TGRAMCTL Details

The lower module (TGRAMCTL) functions are given in Table 9. Tables 10 and
11 list the functions of the inputs and outputs, respectively.
Table 9: Functions of the TGRAMCTL Module

TGRAMCTL Functions

1. Generate uniform variates for addresses to memory banks C and D, thus procuring suitably
distributed state dwell times and idle packet counts

2. Control the movement between and the time in the Markov states (“bursty” and “quiet”).
Also control the generation of idle packet sequences

3. Control the generation of busy packets

4. Control the initialization of the TG, and store the LFSR initialization count

5. Store the test length count

6. Generate a clock signal BYTCLKD with half the frequency of BYTCLK for use by
ROUTGEN2, which due to its loopback structure is slow

7. Perform all other control functions for the TG, for example reset, initialization, and
shutdown


Table 10: TGRAMCTL Inputs and their Functions

Input | Function
BYTCLK | system clock (20 MHz)
VMEADRBAR[15..0] | addresses for storing LFSR initialization and length of test counts, as well as resetting and enabling the TG
VMECTRLBAR[4..0] | control bits for resetting and enabling the TG
ENDFRAME | flag indicating that the last byte in a frame has just been sent to the ECL serialization circuitry. Used by CNTRLER to perform an orderly reset sequence
ROUTDONE | flag which when asserted indicates that the route number is being processed by ROUTGEN2 according to the loopback congestion control procedure discussed above
R1C_DATA[15..0] | memory bank C data output lines which contain (at disjoint times) either the statistically-generated Markov state dwell times, or the statistically-generated idle packet sequence counts
ROUTSENT | flag indicating that the route number byte has just been passed to the ECL serialization circuitry. Used by CNTRLER to begin preparing another idle or busy packet
VMEDATBAR[15..0] | initializing data from VME Slave (test count length and LFSR initialization count)
Table 11: TGRAMCTL Outputs and their Functions

Output | Function
R1C_CE | Memory bank C enable line
R1D_CE | Memory bank D enable line
R1D_ADDR[13..0] | Address lines for memory bank D. The lower eight of these lines are connected to selected LFSR outputs and thus contain uniform variates for generating the statistically-distributed routing numbers
R1C_ADDR[13..0] | Address lines for memory bank C. These lines are connected to selected LFSR outputs and thus contain uniform variates for generating the statistically-distributed Markov state dwell times and corresponding idle packet counts
RESETBAR | TG reset signal generated by CNTRLER
TMSTMPENABAR | enable signal for the time stamp counter, generated by CNTRLER
ENADNCNTR_BAR | enable signal for the down counter which sequences MUXCNTRL and OUTMUX in TGDATPROC; generated by CNTRLER
IDLESEND_BAR | enable signal for the generation of an idle packet. Generated by CNTRLER and sent to ROUTGEN2 in TGDATPROC
EN_ROUTE_BAR | enable signal to begin the loopback congestion control procedure discussed above on the routing number within the ROUTGEN2 module in TGDATPROC; generated by CNTRLER
BUSYSEND_BAR | enable signal for the generation of a busy packet. Generated by CNTRLER and sent to ROUTGEN2 in TGDATPROC
BYTCLKD | BYTCLK divided-by-2 for congestion control (see ROUTGEN2 module above)


The schematic of TGRAMCTL is shown in Figure 11. The LFSR schematic
is shown in Figure 12. It is 32 bits long and has a period of 2^32 - 1. The tap locations
which are used to form the lower 12 bits of the C memory bank address (see discussion
below) and the lower eight bits of the D memory bank address (see discussion below) are
as indicated in the figure.

Figure 11: Schematic of the TGRAMCTL Module



Figure 12: LFSR Schematic Diagram 
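
A 32-bit LFSR of this kind can be modeled in a few lines. The sketch below is an assumption-laden illustration: the feedback taps (32, 22, 2, 1) are a commonly tabulated maximal-length set and are not taken from Figure 12, and using the low-order state bits as the bank C and bank D address offsets is likewise a hypothetical stand-in for the tap locations shown in the figure.

```python
MASK32 = 0xFFFFFFFF


class Lfsr32:
    """Fibonacci LFSR, shifting left, feedback from taps 32, 22, 2 and 1."""

    def __init__(self, seed: int = 1):
        if seed & MASK32 == 0:
            raise ValueError("the all-zero state is a lock-up state")
        self.state = seed & MASK32

    def step(self) -> int:
        s = self.state
        feedback = ((s >> 31) ^ (s >> 21) ^ (s >> 1) ^ s) & 1
        self.state = ((s << 1) | feedback) & MASK32
        return self.state

    def bank_c_offset(self) -> int:
        """12-bit uniform variate for the lower bank C address bits (assumed taps)."""
        return self.state & 0xFFF

    def bank_d_offset(self) -> int:
        """8-bit uniform variate for the lower bank D address bits (assumed taps)."""
        return self.state & 0xFF
```

Stepping the register a programmable number of times before the test starts corresponds to the LFSR initialization count loaded in initialization step 4.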
The four GENREG modules are identical “generic” 16-bit down counters whose
schematic will not be shown because of its simplicity. They function as the Markov state
dwell time counter, idle packet counter, test length counter and LFSR initialization
counter, respectively, reading top-to-bottom on the TGRAMCTL schematic.

The text for the CNTRLER module is given in Figure 13. This module is a Finite
State Machine (FSM) which is the controller for the TG. A Mealy type architecture (i.e.,
outputs change in response to input changes without changing state) was chosen because
it is easier to understand and debug. The state diagram for CNTRLER is given in Figure
14, and the dictionary for its output symbols is given in Table 12. The output PRELOC
in the table represents output l in Figure 13.


SUBDESIGN cntrler
% Renaming of inputs and outputs:  %
% a  = bytclk                      %
% b  = reset                       %
% c  = enable                      %
% d  = tosdc                       %
% e  = ipczero                     %
% f  = tstcntzero                  %
% g  = routsent                    %
% h  = endframe                    %
% i  = lfsrcntzero                 %
% j  = routdone                    %
% k  = bytclkd                     %
% l  = toggle control for location %
% n  = reset_bar                   %
% r  = lfsrcntena_bar              %
% s  = enalfsr_bar                 %
% t  = statechoose                 %
% v  = ldsdc_bar                   %
% w  = ldipc_bar                   %
% xi = tmstmpena_bar               %
% y  = idlesend_bar                %
% z  = en_route_bar                %
% z1 = enadncntr_bar               %
% z3 = cntdecr_bar                 %
% z4 = decripc_bar                 %
% z5 = busysend_bar                %
% z6 = dwell_idlechoose            %
(
    a, b, c, d, e, f        : INPUT;
    g, h, i, j, k           : INPUT;
    n, r, s                 : OUTPUT;
    t, v, w, xi, y          : OUTPUT;
    z, z1, z3, z4, z5, z6   : OUTPUT;
)
VARIABLE
    ss : MACHINE WITH STATES (porset, init, Apre, As, Bs1, Bs2, Bs, Cpre,
                              Csend, Cs, Cpost, Cwait, Ds, Ds1, Dsend, preot);
    l  : NODE;

BEGIN
    ss.clk = a;
    ss.reset = b;

    t = TFFE(VCC, a, VCC, VCC, l);

    TABLE
    % current  current                 next    current                           %
    % state    input                   state   output                            %
      ss,      c,d,e,f,g,h,i,j,k   =>  ss,     l, n,r,s,v,w,xi,y,z,z1,z3,z4,z5,z6;

      porset,  0,x,x,x,x,x,x,x,x   =>  porset, 0, 0,1,0,1,1,1,1,1,1,1,1,1,1;
      porset,  1,x,x,x,x,x,x,x,x   =>  init,   0, 1,1,1,1,1,0,1,1,0,1,1,1,1;
      init,    x,x,x,1,x,x,x,x,x   =>  preot,  0, 1,1,1,1,1,1,1,1,0,1,1,1,1;
      init,    x,x,x,0,x,x,0,x,x   =>  init,   0, 1,0,0,1,1,0,1,1,0,1,1,1,1;
      init,    x,x,x,0,x,x,1,x,x   =>  Apre,   1, 1,1,0,1,1,0,1,1,0,1,1,1,1;
      Apre,    x,x,x,x,x,x,x,x,x   =>  As,     0, 1,1,1,0,1,0,1,1,0,1,1,1,1;
      As,      x,1,x,x,x,x,x,x,x   =>  Apre,   1, 1,1,0,1,1,0,1,1,0,1,1,1,1;
      As,      x,0,x,x,x,x,x,x,x   =>  Bs1,    0, 1,1,0,1,1,0,1,1,0,1,1,1,1;
      Bs1,     x,x,x,x,x,x,x,x,x   =>  Bs2,    0, 1,1,1,1,0,0,1,1,0,1,1,1,0;
      Bs2,     x,x,x,x,x,x,x,x,x   =>  Bs,     0, 1,1,1,1,1,0,1,1,0,1,1,1,1;
      Bs,      x,x,0,x,x,x,x,x,0   =>  Cpre,   0, 1,1,1,1,1,0,0,1,0,1,1,0,1;
      Bs,      x,x,0,x,x,x,x,x,1   =>  Bs,     0, 1,1,1,1,1,0,0,1,0,1,1,0,1;
      Bs,      x,x,1,x,x,x,x,x,0   =>  Cwait,  0, 1,1,1,1,1,0,1,0,0,1,1,0,1;
      Bs,      x,x,1,x,x,x,x,x,1   =>  Bs,     0, 1,1,1,1,1,0,1,0,0,1,1,0,1;
      Cpre,    x,x,x,x,x,x,x,x,x   =>  Csend,  0, 1,1,1,1,1,0,0,0,0,1,1,1,1;
      Csend,   x,x,x,x,x,x,x,1,x   =>  Csend,  0, 1,1,1,1,1,0,0,0,0,1,1,1,1;
      Csend,   x,x,x,x,x,x,x,0,x   =>  Cs,     0, 1,1,1,1,1,0,1,1,0,1,1,1,1;
      Cs,      x,x,x,x,0,x,x,x,x   =>  Cs,     0, 1,1,1,1,1,0,1,1,0,1,1,1,1;
      Cs,      x,x,x,x,1,x,x,x,x   =>  Cpost,  0, 1,1,1,1,1,0,1,1,0,0,0,1,1;
      Cpost,   x,x,x,1,x,x,x,x,x   =>  preot,  0, 1,1,1,1,1,1,1,1,0,1,1,1,1;
      Cpost,   x,1,x,0,x,x,x,x,x   =>  Apre,   1, 1,1,0,1,1,0,1,1,0,1,1,1,1;
      Cpost,   x,0,1,0,x,x,x,x,0   =>  Cwait,  0, 1,1,1,1,1,0,1,0,0,1,1,0,1;
      Cpost,   x,0,1,0,x,x,x,x,1   =>  Cpost,  0, 1,1,1,1,1,0,1,0,0,1,1,0,1;
      Cpost,   x,0,0,0,x,x,x,x,0   =>  Cpre,   0, 1,1,1,1,1,0,0,1,0,1,1,0,1;
      Cpost,   x,0,0,0,x,x,x,x,1   =>  Cpost,  0, 1,1,1,1,1,0,0,1,0,1,1,0,1;
      Cwait,   x,x,x,x,x,x,x,x,x   =>  Dsend,  0, 1,1,1,1,1,0,1,0,0,1,1,1,1;
      Dsend,   x,x,x,x,x,x,x,1,x   =>  Dsend,  0, 1,1,1,1,1,0,1,0,0,1,1,1,1;
      Dsend,   x,x,x,x,x,x,x,0,x   =>  Ds,     0, 1,1,1,1,1,0,1,1,0,1,1,1,1;
      Ds,      x,x,x,x,0,x,x,x,x   =>  Ds,     0, 1,1,1,1,1,0,1,1,0,1,1,1,1;
      Ds,      x,x,x,x,1,x,x,x,x   =>  Ds1,    0, 1,1,1,1,1,0,1,1,0,0,1,1,1;
      Ds1,     x,x,x,1,x,x,x,x,x   =>  preot,  0, 1,1,1,1,1,1,1,1,0,1,1,1,1;
      Ds1,     x,1,x,0,x,x,x,x,x   =>  Apre,   1, 1,1,0,1,1,0,1,1,0,1,1,1,1;
      Ds1,     x,0,x,0,x,x,x,x,x   =>  Bs1,    0, 1,1,0,1,1,0,1,1,0,1,1,1,1;
      preot,   x,x,x,x,x,1,x,x,x   =>  porset, 0, 1,1,1,1,1,1,1,1,1,1,1,1,1;
      preot,   x,x,x,x,x,0,x,x,x   =>  preot,  0, 1,1,1,1,1,1,1,1,0,1,1,1,1;
    END TABLE;
END;


Figure 13: Text of the CNTRLER Module Design
Table 12: Dictionary of the Output Codes in Figure 14

Code | Signals Asserted
1 | RESET_BAR, ENALFSR_BAR
2 | TMSTMPENA_BAR, ENADNCNTR_BAR
3 | ENADNCNTR_BAR
4 | LFSRCNTENA_BAR, ENALFSR_BAR, TMSTMPENA_BAR, ENADNCNTR_BAR
5 | PRELOC, ENALFSR_BAR, TMSTMPENA_BAR, ENADNCNTR_BAR
6 | LDSDC_BAR, TMSTMPENA_BAR, ENADNCNTR_BAR
7 | ENALFSR_BAR, TMSTMPENA_BAR, ENADNCNTR_BAR
8 | LDIPC_BAR, TMSTMPENA_BAR, ENADNCNTR_BAR, DWELL_IDLECHOOSE
9 | TMSTMPENA_BAR, IDLESEND_BAR, ENADNCNTR_BAR, BUSYSEND_BAR
10 | TMSTMPENA_BAR, EN_ROUTE_BAR, ENADNCNTR_BAR, BUSYSEND_BAR
11 | TMSTMPENA_BAR, IDLESEND_BAR, EN_ROUTE_BAR, ENADNCNTR_BAR
12 | TMSTMPENA_BAR, ENADNCNTR_BAR, CNTDECR_BAR
13 | TMSTMPENA_BAR, ENADNCNTR_BAR, CNTDECR_BAR, DECRIPC_BAR
14 | TMSTMPENA_BAR, EN_ROUTE_BAR, ENADNCNTR_BAR
15 | none


II.3 Future Work 

Opportunities for future work exist with the present implementation, as well as 
with a standard-cell ASIC implementation. 

II.3.1 Opportunities With the Present TG Implementation


There are three major enhancements that can be made. The first is the 
incorporation of different packet priorities and lengths. Memory bank D could store the 
user-programmable probability distributions for the priorities and lengths, in addition to 
the route number distribution that it now contains. The loss (2 bits) in pdf resolution 
would be inconsequential because the present value (14 bits) exceeds our needs. 

The second major enhancement is the widening of the Time Stamp Counter to 40
bits. Its present length of 16 bits was chosen because it allowed the design to fit in the
7256 CPLDs. However, the length of the longest test is less than a second. Widening the
counter would increase the test length to over 15 hours, which is ample to study
long-term effects of congestion handling procedures on many different mixes of packet
types and priorities. The LFSR and its initialization counter should also be widened, in
order to guarantee long-term uniformity of the generated addresses.
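
The time spans quoted for the two counter widths follow from the 20 MHz byte clock, as the short calculation below illustrates. It assumes the counter advances once per BYTCLK period and simply rolls over when full; the function name is hypothetical.

```python
BYTCLK_HZ = 20_000_000  # 20 MHz byte clock


def rollover_seconds(counter_bits: int, clock_hz: int = BYTCLK_HZ) -> float:
    """Time for a free-running counter of the given width to wrap around."""
    return (1 << counter_bits) / clock_hz


def rollover_hours(counter_bits: int, clock_hz: int = BYTCLK_HZ) -> float:
    """Same span expressed in hours."""
    return rollover_seconds(counter_bits, clock_hz) / 3600.0
```

A 16-bit counter wraps after 65536 byte times, a few milliseconds, whereas a 40-bit counter runs for more than 15 hours before wrapping.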

The third major enhancement is the inclusion of extra states in the Markov
process. Three or more states may allow more accurate modeling of traffic and more
thorough testing of the switch. The literature on multi-state MMPPs is scanty and this
presents an opportunity for significant research and publication.

These enhancements will require bigger CPLDs. The utilization percentages for
the present design (88% and 47%) are high enough that further modifications might make
routing/fitting/placement very difficult. Routing/fitting of the present design is already
tight, as evidenced by the fact that it consumed 40 man-hours and dictated placement of
the clock division flip-flop on the device other than the one that needed it. The effect of
the enhancements on the speed of the TG will be minimal because of the parallel
architecture.

II.3.2 Opportunities for the Development of a Standard-Cell TG Implementation

A standard-cell ASIC implementation of the TG would allow the incorporation of
the enhancements mentioned above (as well as others) without routing or board
trace/space problems. It would also allow the testing of future switches which require
higher speeds, and the multi-state Markov process would generate more realistic traffic.